algorithmic complexity
- North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
- North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.04)
Binarized Neural Networks Converge Toward Algorithmic Simplicity: Empirical Support for the Learning-as-Compression Hypothesis
Sakabe, Eduardo Y., Abrahão, Felipe S., Simões, Alexandre, Colombini, Esther, Costa, Paula, Gudwin, Ricardo, Zenil, Hector
Understanding and controlling the informational complexity of neural networks is a central challenge in machine learning, with implications for generalization, optimization, and model capacity. While most approaches rely on entropy-based loss functions and statistical metrics, these measures often fail to capture deeper, causally relevant algorithmic regularities embedded in network structure. We propose a shift toward algorithmic information theory, using Binarized Neural Networks (BNNs) as a first proxy. Grounded in algorithmic probability (AP) and the universal distribution it defines, our approach characterizes learning dynamics through a formal, causally grounded lens. We apply the Block Decomposition Method (BDM) -- a scalable approximation of algorithmic complexity based on AP -- and demonstrate that it more closely tracks structural changes during training than entropy, consistently exhibiting stronger correlations with training loss across varying model sizes and randomized training runs. These results support the view of training as a process of algorithmic compression, where learning corresponds to the progressive internalization of structured regularities. In doing so, our work offers a principled estimate of learning progression and suggests a framework for complexity-aware learning and regularization, grounded in first principles from information theory, complexity, and computability.
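A minimal sketch of how a BDM-style estimate might be computed for a binarized weight matrix, assuming a lookup of algorithmic-complexity values for small blocks. The `ctm_estimate` function below is a hypothetical placeholder so the sketch runs end to end; real BDM implementations use precomputed Coding Theorem Method (CTM) tables, not an entropy heuristic. The BDM value is the sum, over distinct tiles, of the tile's complexity plus log2 of its multiplicity.

```python
import math
from collections import Counter

import numpy as np

def ctm_estimate(block: tuple) -> float:
    # Placeholder so the sketch is executable; NOT the real CTM table,
    # which is precomputed from exhaustive runs of small Turing machines.
    flat = [b for row in block for b in row]
    p = sum(flat) / len(flat)
    h = 0.0 if p in (0.0, 1.0) else -(p * math.log2(p) + (1 - p) * math.log2(1 - p))
    return 1.0 + h * len(flat)

def bdm(matrix: np.ndarray, block: int = 4) -> float:
    """Block Decomposition Method: partition the binary matrix into
    block x block tiles, then sum CTM(tile) + log2(multiplicity) over
    the distinct tiles."""
    h, w = matrix.shape
    tiles = Counter()
    for i in range(0, h - h % block, block):
        for j in range(0, w - w % block, block):
            tiles[tuple(map(tuple, matrix[i:i + block, j:j + block]))] += 1
    return sum(ctm_estimate(t) + math.log2(n) for t, n in tiles.items())

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    random_w = rng.integers(0, 2, size=(16, 16))       # unstructured binarized weights
    structured_w = np.tile([[0, 1], [1, 0]], (8, 8))    # highly regular binarized weights
    print("BDM(random)     ~", round(bdm(random_w), 2))
    print("BDM(structured) ~", round(bdm(structured_w), 2))
```

Under this scheme, a weight matrix that has internalized regularities decomposes into few distinct tiles and scores low, which is the sense in which training-as-compression can be tracked.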
- South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.05)
- South America > Brazil > São Paulo (0.04)
- Europe > United Kingdom > England > Greater London > London (0.04)
- (5 more...)
- Research Report > New Finding (0.69)
- Research Report > Experimental Study (0.46)
Strongly Solving $7 \times 6$ Connect-Four on Consumer Grade Hardware
While the game Connect-Four has been solved mathematically and the best move can be effectively computed with search-based methods, a strong solution in the form of a look-up table was believed to be infeasible. In this paper, we revisit a symbolic search method based on binary decision diagrams to produce strong solutions. With our efficient implementation we were able to produce an 89.6 GB look-up table in 47 hours on a single CPU core with 128 GB main memory for the standard $7 \times 6$ board size. In addition to this win-draw-loss evaluation, we include an alpha-beta search in our open-source artifact to find the move which achieves the fastest win or slowest loss.
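A toy illustration of the distance-aware search idea: a plain negamax whose scores encode the number of plies to the end of the game, so the root move that wins fastest (or loses slowest) gets the highest value. This is a hypothetical sketch on a shrunken 4x3 board with three-in-a-row as the winning condition; the paper's artifact instead combines an 89.6 GB win-draw-loss table with alpha-beta on the full 7x6 board.

```python
from functools import lru_cache

COLS, ROWS, CONNECT = 4, 3, 3   # shrunken board; the real game is 7x6, connect four
WIN = 100                        # base score; plies to the end are subtracted from it

def drop(board, col, player):
    """Return a new board with `player` ('x' or 'o') dropped into `col`,
    or None if the column is full. Boards are tuples of column strings."""
    if len(board[col]) >= ROWS:
        return None
    cols = list(board)
    cols[col] = cols[col] + player
    return tuple(cols)

def cell(board, c, r):
    return board[c][r] if r < len(board[c]) else '.'

def is_win(board, player):
    for c in range(COLS):
        for r in range(ROWS):
            if cell(board, c, r) != player:
                continue
            for dc, dr in ((1, 0), (0, 1), (1, 1), (1, -1)):
                if all(0 <= c + i * dc < COLS and 0 <= r + i * dr < ROWS
                       and cell(board, c + i * dc, r + i * dr) == player
                       for i in range(CONNECT)):
                    return True
    return False

@lru_cache(maxsize=None)
def negamax(board, player):
    """Score from `player`'s point of view: WIN - plies for a forced win,
    plies - WIN for a forced loss, 0 for a draw."""
    opponent = 'o' if player == 'x' else 'x'
    if is_win(board, opponent):                    # the previous move already won
        return -WIN
    if all(len(col) == ROWS for col in board):     # board full: draw
        return 0
    best = -WIN
    for col in range(COLS):
        child = drop(board, col, player)
        if child is not None:
            score = -negamax(child, opponent)
            # shrink toward zero by one ply, so faster wins and slower losses rank higher
            best = max(best, score - 1 if score > 0 else score + 1 if score < 0 else 0)
    return best

if __name__ == "__main__":
    empty = tuple("" for _ in range(COLS))
    moves = {col: -negamax(drop(empty, col, 'x'), 'o') for col in range(COLS)}
    print("root move scores:", moves)   # highest score = fastest win / slowest loss
```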
- Europe > Austria > Vienna (0.14)
- Europe > Germany > Rhineland-Palatinate > Kaiserslautern (0.04)
- Europe > Denmark > Capital Region > Copenhagen (0.04)
SuperARC: A Test for General and Super Intelligence Based on First Principles of Recursion Theory and Algorithmic Probability
Hernández-Espinosa, Alberto, Ozelim, Luan, Abrahão, Felipe S., Zenil, Hector
We introduce an open-ended test grounded in algorithmic probability that can avoid benchmark contamination in the quantitative evaluation of frontier models in the context of their Artificial General Intelligence (AGI) and Superintelligence (ASI) claims. Unlike other tests, this test does not rely on statistical compression methods (such as GZIP or LZW), which are more closely related to Shannon entropy than to Kolmogorov complexity. The test challenges fundamental aspects of intelligence, such as synthesis and model creation in the context of inverse problems (generating new knowledge from observation). We argue that metrics based on model abstraction and optimal Bayesian inference for planning can provide a robust framework for testing intelligence, including natural intelligence (human and animal), narrow AI, AGI, and ASI. Our results show no clear evidence of LLM convergence towards a defined level of intelligence, particularly AGI or ASI. We found that LLM versions tend to be fragile and incremental, as new versions may perform worse than older ones, with progress largely driven by the size of training data. The results were compared with a hybrid neurosymbolic approach that theoretically guarantees model convergence from optimal inference based on the principles of algorithmic probability and Kolmogorov complexity. The method outperforms LLMs in a proof-of-concept on short binary sequences. Our findings confirm suspicions regarding the fundamental limitations of LLMs, exposing them as systems optimised for the perception of mastery over human language. Progress among different LLM versions from the same developers was found to be inconsistent and limited, particularly in the absence of a solid symbolic counterpart.
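The point that statistical compressors track Shannon entropy rather than Kolmogorov complexity can be illustrated with a short, hedged experiment (not part of the paper): the byte stream below is generated by a tiny deterministic program, so its algorithmic complexity is small, yet zlib compresses it barely better than genuinely random bytes because its symbol statistics look random.

```python
import os
import zlib

def lcg_bytes(n: int, seed: int = 12345) -> bytes:
    """n pseudo-random-looking bytes from a tiny linear congruential generator:
    algorithmically simple (this function is the whole generating program),
    but statistically close to uniform noise."""
    out, x = bytearray(), seed
    for _ in range(n):
        x = (1103515245 * x + 12345) % 2**31
        out.append((x >> 16) & 0xFF)
    return bytes(out)

def compressed_size(data: bytes) -> int:
    return len(zlib.compress(data, level=9))

if __name__ == "__main__":
    n = 1 << 16
    samples = [
        ("LCG output", lcg_bytes(n)),       # short program, noisy-looking output
        ("os.urandom", os.urandom(n)),      # incompressible by any means
        ("'01' repeated", b"01" * (n // 2)) # low entropy AND low complexity
    ]
    for name, data in samples:
        print(f"{name:>14}: {compressed_size(data)} bytes after zlib (original {len(data)})")
```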
- Europe > Belgium (0.04)
- South America > Brazil (0.04)
- North America > United States > Massachusetts > Suffolk County > Boston (0.04)
- (8 more...)
- Education (1.00)
- Health & Medicine > Pharmaceuticals & Biotechnology (0.92)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.34)
Sparks of Explainability: Recent Advancements in Explaining Large Vision Models
This thesis explores advanced approaches to improve explainability in computer vision by analyzing and modeling the features exploited by deep neural networks. Initially, it evaluates attribution methods, notably saliency maps, by introducing a metric based on algorithmic stability and an approach utilizing Sobol indices, which, through quasi-Monte Carlo sequences, allows a significant reduction in computation time. In addition, the EVA method offers a first formulation of attribution with formal guarantees via verified perturbation analysis. Experimental results indicate that in complex scenarios these methods do not provide sufficient understanding, particularly because they identify only "where" the model focuses without clarifying "what" it perceives. Two hypotheses are therefore examined: aligning models with human reasoning -- through the introduction of a training routine that integrates the imitation of human explanations and optimization within the space of 1-Lipschitz functions -- and adopting a conceptual explainability approach. The CRAFT method is proposed to automate the extraction of the concepts used by the model and to assess their importance, complemented by MACO, which enables their visualization. These works converge towards a unified framework, illustrated by an interactive demonstration applied to the 1000 ImageNet classes in a ResNet model.
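As a rough sketch of the Sobol-index idea mentioned above (not the thesis's actual implementation): treat per-patch occlusion masks as input factors of a black-box model and estimate each patch's total Sobol index with the standard Jansen estimator. The thesis additionally draws the samples from quasi-Monte Carlo sequences to cut the number of model calls; plain Monte Carlo is used here to keep the sketch short.

```python
import numpy as np

def total_sobol_indices(f, d, n=2048, rng=None):
    """Total-order Sobol indices of a black-box f: [0,1]^d -> R, via the
    Jansen estimator. Quasi-Monte Carlo point sets would converge faster."""
    rng = rng or np.random.default_rng(0)
    A = rng.random((n, d))
    B = rng.random((n, d))
    fA = f(A)
    var = np.var(np.concatenate([fA, f(B)]))
    s_total = np.empty(d)
    for i in range(d):
        AB_i = A.copy()
        AB_i[:, i] = B[:, i]          # resample only factor i
        s_total[i] = np.mean((fA - f(AB_i)) ** 2) / (2 * var)
    return s_total

if __name__ == "__main__":
    # Toy "model": a score that depends strongly on patches 0 and 3 of a
    # 6-patch occlusion mask and weakly on the rest -- standing in for a
    # vision model evaluated on partially masked images.
    def toy_model(masks):
        return 3.0 * masks[:, 0] + 2.5 * masks[:, 3] + 0.2 * masks[:, 1:3].sum(axis=1)

    importances = total_sobol_indices(toy_model, d=6)
    print("estimated total Sobol index per patch:", np.round(importances, 3))
```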
- Europe > Italy > Marche > Ancona Province > Ancona (0.04)
- Europe > Finland > Uusimaa > Helsinki (0.04)
- South America > Argentina > Patagonia > Tierra del Fuego Province > Ushuaia (0.04)
- (5 more...)
- Workflow (1.00)
- Research Report > Promising Solution (1.00)
- Research Report > New Finding (1.00)
- (3 more...)
- Information Technology > Security & Privacy (1.00)
- Government > Regional Government (1.00)
- Health & Medicine > Therapeutic Area > Neurology (0.92)
- (3 more...)
Decoding Geometric Properties in Non-Random Data from First Information-Theoretic Principles
Zenil, Hector, Abrahão, Felipe S.
Based on the principles of information theory, measure theory, and theoretical computer science, we introduce a univariate signal deconvolution method with a wide range of applications to coding theory, particularly in zero-knowledge one-way communication channels, such as in deciphering messages from unknown generating sources about which no prior knowledge is available and to which no return message can be sent. Our multidimensional space reconstruction method from an arbitrary received signal is proven to be agnostic vis-a-vis the encoding-decoding scheme, computation model, programming language, formal theory, the computable (or semi-computable) method of approximation to algorithmic complexity, and any arbitrarily chosen (computable) probability measure of the events. The method derives from the principles of an approach to Artificial General Intelligence capable of building a general-purpose model of models independent of any arbitrarily assumed prior probability distribution. We argue that this optimal and universal method of decoding non-random data has applications to signal processing, causal deconvolution, topological and geometric properties encoding, cryptography, and bio- and technosignature detection.
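A loose caricature of the reconstruction idea (the paper uses algorithmic-probability estimates and proves the scheme-agnosticism formally; the tile-counting proxy below is only a crude, hypothetical stand-in): given a flat received bit string, try every candidate 2D shape, score each reshaping with a complexity proxy, and keep the shapes under which the data looks least random.

```python
from collections import Counter

import numpy as np

def tile_complexity(arr: np.ndarray, block: int = 2) -> int:
    """Crude 2D complexity proxy: number of distinct non-overlapping
    block x block tiles. More distinct tiles = less apparent regularity."""
    h, w = arr.shape
    tiles = Counter()
    for i in range(0, h - h % block, block):
        for j in range(0, w - w % block, block):
            tiles[arr[i:i + block, j:j + block].tobytes()] += 1
    return len(tiles)

def candidate_shapes(n: int, min_dim: int = 2):
    return [(r, n // r) for r in range(min_dim, n // min_dim + 1) if n % r == 0]

if __name__ == "__main__":
    # Hypothetical received "message": a 24x24 binary image containing a centred
    # square, transmitted as a flat 576-bit signal with no shape metadata.
    img = np.zeros((24, 24), dtype=np.uint8)
    img[6:18, 6:18] = 1
    received = img.flatten()

    scores = {s: tile_complexity(received.reshape(s)) for s in candidate_shapes(received.size)}
    ranking = sorted(scores.items(), key=lambda kv: kv[1])
    # The true shape (24, 24) should rank at or near the top: only under the
    # original geometry do the tiles collapse to a handful of types.
    print("lowest-complexity candidate shapes:", ranking[:5])
```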
- North America > United States > New York > New York County > New York City (0.14)
- North America > Puerto Rico > Arecibo > Arecibo (0.04)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- (6 more...)
- Health & Medicine (0.45)
- Information Technology > Security & Privacy (0.34)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Spatial Reasoning (0.60)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.34)
- Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.34)
Algorithmic Information Forecastability
Amigo, Glauco, Díaz-Pachón, Daniel Andrés, Marks, Robert J., Baylis, Charles
Not every time series can be forecast: the flips of a fair coin, for example, cannot, while the repeated {01} sequence {010101...} can be forecast exactly. Algorithmic information theory can provide a measure of forecastability that lies between these extremes. The degree of forecastability is a function of the data alone. For prediction (or classification) of labeled data, we propose three categories of forecastability: oracle forecastability for predictions that are always exact, precise forecastability for errors up to a bound, and probabilistic forecastability for all other predictions. Examples are given in each case.
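Read operationally, the three categories can be checked directly from a set of predictions and ground-truth labels. A minimal sketch follows; the bound `epsilon` and the naming are illustrative, not the paper's notation.

```python
import numpy as np

def forecastability_category(y_true, y_pred, epsilon: float) -> str:
    """Classify a predictor's behaviour on labelled data in the spirit of the
    three categories above: 'oracle' if every prediction is exact, 'precise'
    if every error stays within the bound epsilon, 'probabilistic' otherwise."""
    errors = np.abs(np.asarray(y_true, dtype=float) - np.asarray(y_pred, dtype=float))
    if np.all(errors == 0):
        return "oracle"
    if np.all(errors <= epsilon):
        return "precise"
    return "probabilistic"

if __name__ == "__main__":
    y = np.array([0, 1, 0, 1, 0, 1])                             # the repeated {01} sequence
    print(forecastability_category(y, y, 0.1))                   # oracle
    print(forecastability_category(y, y + 0.05, 0.1))            # precise
    print(forecastability_category(y, np.zeros_like(y), 0.1))    # probabilistic
```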
- North America > United States > California (0.04)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Computational Learning Theory (0.47)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.46)
- Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models (0.46)
Optimal Spatial Deconvolution and Message Reconstruction from a Large Generative Model of Models
Zenil, Hector, Adams, Alyssa, Abrahão, Felipe S.
We introduce a univariate signal deconvolution method based on the principles of an approach to Artificial General Intelligence in order to build a general-purpose model of models independent of any arbitrarily assumed prior probability distribution. We investigate how non-random data may encode information about the physical properties, such as dimensions and length scales of the space in which a signal or message may have been originally encoded, embedded, or generated. Our multidimensional space reconstruction method is based on information theory and algorithmic probability, so that it is proven to be agnostic vis-a-vis the arbitrarily chosen encoding-decoding scheme, computable or semi-computable method of approximation to algorithmic complexity, and computational model. The results presented in this paper are useful for applications in coding theory, particularly in zero-knowledge one-way communication channels, such as in deciphering messages from unknown generating sources about which no prior knowledge is available and to which no return message can be sent. We argue that this method has the potential to be of great value in cryptography, signal processing, causal deconvolution, life and technosignature detection.
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.28)
- Europe > United Kingdom > England > Oxfordshire > Oxford (0.28)
- North America > United States > New York > New York County > New York City (0.14)
- (12 more...)
- Information Technology > Artificial Intelligence > Natural Language > Generation (0.40)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.34)
- Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.34)
Algorithmic Probability of Large Datasets and the Simplicity Bubble Problem in Machine Learning
Abrahão, Felipe S., Zenil, Hector, Porto, Fabio, Wehmuth, Klaus
When large datasets are mined in order to predict new data, limitations of the principles behind statistical machine learning pose a serious challenge not only to the Big Data deluge, but also to the traditional assumption that data-generating processes are biased toward low algorithmic complexity. Even when one assumes an underlying algorithmic-informational bias toward simplicity in finite dataset generators, we show that fully automated computable learning algorithms (with or without access to pseudo-random generators), in particular the statistical methods used in current approaches to machine learning (including deep learning), can always be deceived, naturally or artificially, by sufficiently large datasets. In particular, we demonstrate that, for every finite learning algorithm, there is a sufficiently large dataset size above which the algorithmic probability of an unpredictable deceiver is an upper bound (up to a multiplicative constant that only depends on the learning algorithm) for the algorithmic probability of any other larger dataset. In other words, very large and complex datasets are as likely to deceive learning algorithms into a "simplicity bubble" as any other particular dataset. These deceiving datasets guarantee that any prediction will diverge from the high-algorithmic-complexity globally optimal solution while converging toward the low-algorithmic-complexity locally optimal solution. We discuss the framework and empirical conditions for circumventing this deceptive phenomenon, moving away from statistical machine learning towards a stronger type of machine learning based on, or motivated by, the intrinsic power of algorithmic information theory and computability theory.
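A toy caricature of the deception phenomenon (not the paper's formal construction, which is stated in terms of algorithmic-probability bounds): a statistical learner that extrapolates the apparent low-complexity regularity of a large observed prefix can be driven far from the truth by a dataset deliberately built to look simple up to the observation horizon.

```python
import numpy as np

def periodic_learner(prefix: np.ndarray, horizon: int, max_period: int = 64) -> np.ndarray:
    """A deliberately simple statistical learner: find the period that best
    explains the observed prefix and extrapolate it for `horizon` more steps."""
    best_p = min(range(1, max_period + 1),
                 key=lambda p: np.sum(prefix != np.resize(prefix[:p], prefix.size)))
    return np.resize(prefix[:best_p], prefix.size + horizon)[prefix.size:]

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n_seen, n_future = 10_000, 1_000
    # "Deceiving" dataset: maximally simple over the observed prefix, then an
    # unrelated, high-complexity continuation beyond the observation horizon.
    prefix = np.tile([0, 1], n_seen // 2)
    tail = rng.integers(0, 2, n_future)
    prediction = periodic_learner(prefix, n_future)
    accuracy = np.mean(prediction == tail)
    print(f"in-sample fit is perfect, out-of-sample accuracy ~ {accuracy:.2f}")
```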
- North America > United States > New York > New York County > New York City (0.14)
- South America > Brazil (0.04)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- (3 more...)